Parallel Input/Output Impact on Sparse Matrix Compression

Authors

  • Sorin G. Nastea
  • Tarek A. El-Ghazawi
  • Ophir Frieder

Abstract

Sparse matrices efficiently store structured information, particularly when represented in compressed formats. The advantages of compressed formats over expanded representations are reduced storage space and faster computation, achieved by not processing the zero elements. We address the I/O bottleneck associated with the compression operation and show that this bottleneck can be reduced if parallel I/O techniques are used. We study the Parallel File System (PFS) file access modes available on an Intel Paragon with 64 processing nodes (of which 56 are compute nodes and 3 are I/O nodes). These PFS file access modes are M_UNIX, M_LOG, M_SYNC, M_RECORD, M_GLOBAL, and M_ASYNC. They differ mainly in the concurrency constraints they impose, their synchronization methods, and the way the file pointer is maintained. We present compression time results and show that I/O becomes dominant as the number of processors increases. To minimize the compression time, we used asynchronous reads, letting the compute nodes work on compression while the I/O nodes provide the next set of raw data. To maximize I/O performance, each read fetches a number of rows whose combined size is an integer multiple of the stripe size. In Figure 1, we present compression time results for PSMIGR 1, a sparse matrix selected from the Harwell-Boeing collection. Our experiments show that the best results are obtained with the M_ASYNC PFS file access mode. In this case, the read size is equal to the stripe size (64 KB) multiplied by the number of I/O nodes (3). The deterioration of the compression time when the M_GLOBAL mode is used is due to excessive paging as the number of nodes increases: in this mode, each processing node gets a full copy of the data read by all others, which can exceed the local buffer size.
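
The two tuning ideas in the abstract, sizing each read as a whole number of stripes and overlapping the next read with the current compression step, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Paragon implementation: the compressed format is assumed to be CSR (the abstract does not name one), a background thread with a one-slot queue stands in for the PFS asynchronous read, and `rows_per_read`, `compress_rows`, `compress_overlapped`, and the `read_chunk` callback are hypothetical names. Only the 64 KB stripe size and the 3 I/O nodes come from the abstract.

```python
import math
import queue
import threading

STRIPE_BYTES = 64 * 1024  # stripe size quoted in the abstract
IO_NODES = 3              # number of I/O nodes quoted in the abstract


def rows_per_read(row_bytes, stripe_bytes=STRIPE_BYTES, io_nodes=IO_NODES):
    """Smallest row count whose combined size is a whole multiple of
    stripe_bytes * io_nodes (one full stripe per I/O node)."""
    target = stripe_bytes * io_nodes
    return target // math.gcd(row_bytes, target)


def compress_rows(rows, csr):
    """Append dense rows to a CSR triple (values, col_idx, row_ptr).
    CSR is an assumption; the abstract does not specify the format."""
    values, col_idx, row_ptr = csr
    for row in rows:
        for j, x in enumerate(row):
            if x != 0:  # zero elements are skipped, not stored
                values.append(x)
                col_idx.append(j)
        row_ptr.append(len(values))


def compress_overlapped(read_chunk, n_chunks):
    """Compress chunk k while a background thread fetches chunk k + 1.
    read_chunk(k) is a hypothetical callback returning a list of dense rows."""
    csr = ([], [], [0])
    pending = queue.Queue(maxsize=1)  # at most one prefetched chunk waits

    def reader():
        for k in range(n_chunks):
            pending.put(read_chunk(k))
        pending.put(None)  # sentinel: no more data

    worker = threading.Thread(target=reader, daemon=True)
    worker.start()
    while (chunk := pending.get()) is not None:
        compress_rows(chunk, csr)
    worker.join()
    return csr
```

For example, with 1024-byte rows, `rows_per_read(1024)` gives 192 rows per read, which is exactly 64 KB × 3. Rows whose byte size shares few factors with the stripe size need many more rows per read, so in practice one might pad rows or accept a partial final stripe.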

Similar resources

Simultaneous Input and Output Matrix Partitioning for Outer-Product-Parallel Sparse Matrix-Matrix Multiplication

For outer-product–parallel sparse matrix-matrix multiplication (SpGEMM) of the form C=A×B, we propose three hypergraph models that achieve simultaneous partitioning of input and output matrices without any replication of input data. All three hypergraph models perform conformable one-dimensional (1D) columnwise and 1D rowwise partitioning of the input matrices A and B, respectively. The first h...

Constrained Fine-Grain Parallel Sparse Matrix Distribution

We consider how to distribute sparse matrices among processors to reduce communication cost in parallel sparse matrix computations, in particular, sparse matrix-vector multiplication. We allow 2d distributions, where the distribution (partitioning) is not constrained to just rows or columns. The fine-grain model is a 2d distribution introduced in [2] where nonzeros can be assigned to processors...

Exploiting random projections and sparsity with random forests and gradient boosting methods - Application to multi-label and multi-output learning, random forest model compression and leveraging input sparsity

Within machine learning, the supervised learning field aims at modeling the input-output relationship of a system, from past observations of its behavior. Decision trees characterize the input-output relationship through a series of nested if-then-else questions, the testing nodes, leading to a set of predictions, the leaf nodes. Several of such trees are often combined together for state-of-...

Deblocking Joint Photographic Experts Group Compressed Images via Self-learning Sparse Representation

JPEG is one of the most widely used image compression methods, but it causes annoying blocking artifacts at low bit-rates. Sparse representation is an efficient technique which can solve many inverse problems in image processing applications such as denoising and deblocking. In this paper, a post-processing method is proposed for reducing JPEG blocking effects via sparse representation. In this ...

Sampling and Analytical Techniques for Data Distribution of Parallel Sparse Computation

We present a compile-time method to select compression and distribution schemes for sparse matrices which are computed using Fortran 90 array intrinsic operations. The selection process samples input sparse matrices to determine their sparsity structures. It is also guided by cost functions of various sparse routines as measured from the target machine. The Fortran 90 array expression is then t...

Publication date: 1996